Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 9850 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.1 MiB |
| Average record size in memory | 112.0 B |
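The memory figures above agree with each other. As a rough check, assuming each of the 14 columns takes 8 bytes per row (an assumption; the report does not list the dtypes), the arithmetic can be sketched as:

```python
# Sanity-check the reported memory figures.
# Assumption (not stated in the report): every column uses 8 bytes per row.
N_ROWS = 9850
N_COLS = 14
BYTES_PER_VALUE = 8  # hypothetical storage width per cell

record_bytes = N_COLS * BYTES_PER_VALUE        # bytes per row
total_mib = N_ROWS * record_bytes / 1024 ** 2  # total size in MiB

print(f"{record_bytes} B per record, {total_mib:.1f} MiB in total")
```

Under that assumption the result is 112 B per record and about 1.05 MiB in total, which rounds to the 1.1 MiB shown in the table.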
Variable types
| Numeric | 11 |
|---|---|
| Categorical | 3 |
Alerts
| Alert | Type |
|---|---|
| BirthYear is highly correlated with MonthSal and 1 other field | High correlation |
| MonthSal is highly correlated with BirthYear | High correlation |
| Children is highly correlated with BirthYear | High correlation |
| CustMonVal is highly correlated with ClaimsRate | High correlation |
| ClaimsRate is highly correlated with CustMonVal | High correlation |
| PremMotor is highly correlated with PremHousehold and 3 other fields | High correlation |
| PremHousehold is highly correlated with PremMotor | High correlation |
| PremHealth is highly correlated with PremMotor | High correlation |
| PremLife is highly correlated with PremMotor | High correlation |
| PremWork is highly correlated with PremMotor | High correlation |
| CustID is uniformly distributed | Uniform |
| CustID has unique values | Unique |
Reproduction
| Analysis started | 2022-01-07 17:26:33.528536 |
|---|---|
| Analysis finished | 2022-01-07 17:26:47.938705 |
| Duration | 14.41 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
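A report like this one could be regenerated along the following lines. This is a minimal sketch, assuming the data is already loaded into a pandas DataFrame; the function name, output path, and title are hypothetical, and the original configuration (`config.json`) is not reproduced here.

```python
def build_report(df, output_path="report.html", title="Dataset profile"):
    """Profile a pandas DataFrame and write the result as an HTML file.

    `output_path` and `title` are illustrative defaults, not values taken
    from the original analysis.
    """
    # Imported lazily so this module still loads when the package is absent.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title=title)
    profile.to_file(output_path)
    return profile
```

Calling `build_report(df)` on the 9850-row customer table would then write `report.html` to the working directory.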
CustID
Real number (ℝ≥0)
| Distinct | 9850 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5146.934518 |
| Minimum | 1 |
| Maximum | 10296 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 519.45 |
| Q1 | 2571.25 |
| median | 5147.5 |
| Q3 | 7713.75 |
| 95-th percentile | 9784.55 |
| Maximum | 10296 |
| Range | 10295 |
| Interquartile range (IQR) | 5142.5 |
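Non-integer percentiles such as 519.45 suggest that quantiles are interpolated between neighbouring sorted values. The sketch below assumes linear interpolation (the default in NumPy and pandas); the report itself does not state the method. The sample is the ten smallest CustID values listed in this report.

```python
import math


def quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)           # fractional index into sorted data
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower                     # distance from the lower neighbour
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


sample = [1, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # ten smallest CustID values
q1 = quantile(sample, 0.25)
q3 = quantile(sample, 0.75)
iqr = q3 - q1                              # interquartile range = Q3 - Q1
value_range = max(sample) - min(sample)    # range = maximum - minimum
```

The same definitions reproduce the table for the full column: IQR = 7713.75 - 2571.25 = 5142.5 and Range = 10296 - 1 = 10295.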
Descriptive statistics
| Standard deviation | 2968.364203 |
|---|---|
| Coefficient of variation (CV) | 0.5767246878 |
| Kurtosis | -1.197924516 |
| Mean | 5146.934518 |
| Median Absolute Deviation (MAD) | 2572 |
| Skewness | 0.002176056462 |
| Sum | 50697305 |
| Variance | 8811186.041 |
| Monotonicity | Strictly increasing |
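The descriptive statistics above are related by simple formulas: the variance is the square of the standard deviation, the CV is the standard deviation divided by the mean, and the MAD is the median of the absolute deviations from the median. A minimal sketch follows, assuming the sample (n - 1) standard deviation that pandas uses by default; the function name and the non-strict monotonicity labels are illustrative assumptions.

```python
import statistics


def describe(values):
    """Compute a few of the descriptive statistics shown in the report."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)         # sample standard deviation (n - 1)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])

    # Compare each value with its successor to classify the ordering.
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        monotonicity = "Strictly increasing"
    elif all(a <= b for a, b in pairs):
        monotonicity = "Increasing"
    elif all(a > b for a, b in pairs):
        monotonicity = "Strictly decreasing"
    elif all(a >= b for a, b in pairs):
        monotonicity = "Decreasing"
    else:
        monotonicity = "Not monotonic"

    return {
        "mean": mean,
        "std": std,
        "variance": std ** 2,
        "cv": std / mean,                  # coefficient of variation
        "mad": mad,                        # median absolute deviation
        "monotonicity": monotonicity,
    }


stats = describe([1, 3, 4, 5, 6])
```

For CustID, 2968.364203 / 5146.934518 ≈ 0.5767, which matches the reported CV.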
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 2049 | 1 | < 0.1% |
| 2756 | 1 | < 0.1% |
| 4791 | 1 | < 0.1% |
| 8889 | 1 | < 0.1% |
| 2748 | 1 | < 0.1% |
| 701 | 1 | < 0.1% |
| 6846 | 1 | < 0.1% |
| 4799 | 1 | < 0.1% |
| 8897 | 1 | < 0.1% |
| 709 | 1 | < 0.1% |
| Other values (9840) | 9840 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 11 | 1 |
| Value | Count | Frequency (%) |
| 10296 | 1 | |
| 10295 | 1 | |
| 10294 | 1 | |
| 10292 | 1 | |
| 10290 | 1 | |
| 10289 | 1 | |
| 10288 | 1 | |
| 10287 | 1 | |
| 10286 | 1 | |
| 10285 | 1 |
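The histograms in this report use fixed-size bins (50 for CustID). The idea can be sketched as follows: split the interval from minimum to maximum into equally wide bins and count the values in each. The function name is hypothetical, and the exact edge handling of the plotting library is an assumption (here the maximum falls into the last bin).

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:                           # all values identical
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that the maximum lands in the last bin, not past it.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


counts = fixed_width_histogram([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], bins=5)
```

A nearly flat set of counts, as expected for an ID column, is consistent with the "Uniform" alert raised for CustID.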
FirstPolYear
Real number (ℝ≥0)
| Distinct | 25 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1986.007614 |
| Minimum | 1974 |
| Maximum | 1998 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | 1974 |
|---|---|
| 5-th percentile | 1976 |
| Q1 | 1980 |
| median | 1986 |
| Q3 | 1992 |
| 95-th percentile | 1996 |
| Maximum | 1998 |
| Range | 24 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 6.585369078 |
|---|---|
| Coefficient of variation (CV) | 0.003315883097 |
| Kurtosis | -1.155080145 |
| Mean | 1986.007614 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -0.02094014126 |
| Sum | 19562175 |
| Variance | 43.36708589 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=25)
| Value | Count | Frequency (%) |
| 1988 | 488 | 5.0% |
| 1986 | 469 | 4.8% |
| 1994 | 456 | 4.6% |
| 1993 | 454 | 4.6% |
| 1984 | 447 | 4.5% |
| 1989 | 443 | 4.5% |
| 1977 | 433 | 4.4% |
| 1982 | 433 | 4.4% |
| 1992 | 428 | 4.3% |
| 1990 | 427 | 4.3% |
| Other values (15) | 5372 |
| Value | Count | Frequency (%) |
| 1974 | 133 | 1.4% |
| 1975 | 271 | |
| 1976 | 415 | |
| 1977 | 433 | |
| 1978 | 427 | |
| 1979 | 422 | |
| 1980 | 415 | |
| 1981 | 422 | |
| 1982 | 433 | |
| 1983 | 413 |
| Value | Count | Frequency (%) |
| 1998 | 103 | 1.0% |
| 1997 | 248 | |
| 1996 | 426 | |
| 1995 | 424 | |
| 1994 | 456 | |
| 1993 | 454 | |
| 1992 | 428 | |
| 1991 | 418 | |
| 1990 | 427 | |
| 1989 | 443 |
BirthYear
Real number (ℝ≥0)
| Distinct | 67 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1967.237563 |
| Minimum | 1935 |
| Maximum | 2001 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | 1935 |
|---|---|
| 5-th percentile | 1941 |
| Q1 | 1953 |
| median | 1967 |
| Q3 | 1981 |
| 95-th percentile | 1994 |
| Maximum | 2001 |
| Range | 66 |
| Interquartile range (IQR) | 28 |
Descriptive statistics
| Standard deviation | 16.89071892 |
|---|---|
| Coefficient of variation (CV) | 0.008586008742 |
| Kurtosis | -1.129628226 |
| Mean | 1967.237563 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 0.02629900469 |
| Sum | 19377290 |
| Variance | 285.2963856 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1968 | 215 | 2.2% |
| 1962 | 204 | 2.1% |
| 1964 | 193 | 2.0% |
| 1953 | 192 | 1.9% |
| 1974 | 187 | 1.9% |
| 1981 | 186 | 1.9% |
| 1951 | 185 | 1.9% |
| 1977 | 185 | 1.9% |
| 1984 | 184 | 1.9% |
| 1963 | 183 | 1.9% |
| Other values (57) | 7936 |
| Value | Count | Frequency (%) |
| 1935 | 14 | 0.1% |
| 1936 | 36 | 0.4% |
| 1937 | 56 | 0.6% |
| 1938 | 75 | |
| 1939 | 100 | |
| 1940 | 127 | |
| 1941 | 145 | |
| 1942 | 160 | |
| 1943 | 156 | |
| 1944 | 168 |
| Value | Count | Frequency (%) |
| 2001 | 6 | 0.1% |
| 2000 | 15 | 0.2% |
| 1999 | 39 | 0.4% |
| 1998 | 57 | 0.6% |
| 1997 | 93 | |
| 1996 | 94 | |
| 1995 | 109 | |
| 1994 | 131 | |
| 1993 | 130 | |
| 1992 | 150 |
EducDeg
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 77.1 KiB |
| b'3 - BSc/MSc' | |
|---|---|
| b'2 - High School' | |
| b'1 - Basic' | |
| b'4 - PhD' |
Length
| Max length | 18 |
|---|---|
| Median length | 14 |
| Mean length | 14.88040609 |
| Min length | 10 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | b'2 - High School' |
|---|---|
| 2nd row | b'1 - Basic' |
| 3rd row | b'3 - BSc/MSc' |
| 4th row | b'3 - BSc/MSc' |
| 5th row | b'2 - High School' |
Common Values
| Value | Count | Frequency (%) |
| b'3 - BSc/MSc' | 4772 | |
| b'2 - High School' | 3370 | |
| b'1 - Basic' | 1012 | 10.3% |
| b'4 - PhD' | 696 | 7.1% |
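The `b'...'` wrapper around the EducDeg values suggests they are stored as Python bytes rather than text; this is an inference from the display, not something the report states. The sketch below, with a hypothetical helper name, decodes such a value and splits it into its ordinal code and label.

```python
def split_educ_label(raw):
    """Split a value such as b'2 - High School' into (2, 'High School')."""
    # Decode bytes to text; leave values that are already text untouched.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)
    code, _, label = text.partition(" - ")
    return int(code), label


code, label = split_educ_label(b"2 - High School")

# The reported lengths (10 to 18) count the b'...' wrapper as well.
shortest = len(str(b"4 - PhD"))
longest = len(str(b"2 - High School"))
```

Decoding before profiling would give labels without the wrapper and a numeric code that preserves the ordering from Basic to PhD.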
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
|  | 9850 | |
| b'3 | 4772 | |
| bsc/msc | 4772 | |
| b'2 | 3370 | 10.2% |
| high | 3370 | 10.2% |
| school | 3370 | 10.2% |
| b'1 | 1012 | 3.1% |
| basic | 1012 | 3.1% |
| b'4 | 696 | 2.1% |
| phd | 696 | 2.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| No values found. | ||
Most occurring categories
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per category
Most occurring scripts
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per script
Most occurring blocks
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per block
MonthSal
Real number (ℝ≥0)
| Distinct | 3481 |
|---|---|
| Distinct (%) | 35.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2543.724365 |
| Minimum | 333 |
| Maximum | 5021 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | 333 |
|---|---|
| 5-th percentile | 1005 |
| Q1 | 1784 |
| median | 2542.5 |
| Q3 | 3307 |
| 95-th percentile | 4051.55 |
| Maximum | 5021 |
| Range | 4688 |
| Interquartile range (IQR) | 1523 |
Descriptive statistics
| Standard deviation | 959.9554771 |
|---|---|
| Coefficient of variation (CV) | 0.3773818776 |
| Kurtosis | -0.8725865868 |
| Mean | 2543.724365 |
| Median Absolute Deviation (MAD) | 762 |
| Skewness | -0.008129194926 |
| Sum | 25055685 |
| Variance | 921514.518 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 2501.5 | 36 | 0.4% |
| 3200 | 10 | 0.1% |
| 3560 | 10 | 0.1% |
| 3776 | 10 | 0.1% |
| 2687 | 10 | 0.1% |
| 2308 | 9 | 0.1% |
| 1924 | 9 | 0.1% |
| 2073 | 9 | 0.1% |
| 1766 | 9 | 0.1% |
| 2959 | 9 | 0.1% |
| Other values (3471) | 9729 |
| Value | Count | Frequency (%) |
| 333 | 2 | |
| 334 | 1 | < 0.1% |
| 335 | 2 | |
| 341 | 2 | |
| 344 | 1 | < 0.1% |
| 348 | 1 | < 0.1% |
| 350 | 1 | < 0.1% |
| 356 | 2 | |
| 358 | 4 | |
| 364 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 5021 | 1 | < 0.1% |
| 4995 | 1 | < 0.1% |
| 4904 | 1 | < 0.1% |
| 4897 | 1 | < 0.1% |
| 4883 | 3 | |
| 4872 | 1 | < 0.1% |
| 4869 | 1 | < 0.1% |
| 4857 | 2 | |
| 4843 | 1 | < 0.1% |
| 4797 | 1 | < 0.1% |
GeoLivArea
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 77.1 KiB |
| 4 | |
|---|---|
| 1 | |
| 3 | |
| 2 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 3 |
| 3rd row | 4 |
| 4th row | 4 |
| 5th row | 4 |
Common Values
| Value | Count | Frequency (%) |
| 4 | 3966 | |
| 1 | 2923 | |
| 3 | 1965 | |
| 2 | 996 | 10.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 4 | 3966 | |
| 1 | 2923 | |
| 3 | 1965 | |
| 2 | 996 | 10.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| No values found. | ||
Most occurring categories
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per category
Most occurring scripts
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per script
Most occurring blocks
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per block
Children
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 77.1 KiB |
| 1 | |
|---|---|
| 0 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 6986 | |
| 0 | 2864 |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 1 | 6986 | |
| 0 | 2864 |
Most occurring characters
| Value | Count | Frequency (%) |
| No values found. | ||
Most occurring categories
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per category
Most occurring scripts
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per script
Most occurring blocks
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per block
CustMonVal
Real number (ℝ)
| Distinct | 6721 |
|---|---|
| Distinct (%) | 68.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 213.3027929 |
| Minimum | -312.83 |
| Maximum | 1121.54 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 2635 |
| Negative (%) | 26.8% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | -312.83 |
|---|---|
| 5-th percentile | -87.727 |
| Q1 | -8.9875 |
| median | 186.265 |
| Q3 | 396.9425 |
| 95-th percentile | 615.3015 |
| Maximum | 1121.54 |
| Range | 1434.37 |
| Interquartile range (IQR) | 405.93 |
Descriptive statistics
| Standard deviation | 242.6529295 |
|---|---|
| Coefficient of variation (CV) | 1.137598464 |
| Kurtosis | -0.1400726593 |
| Mean | 213.3027929 |
| Median Absolute Deviation (MAD) | 200.595 |
| Skewness | 0.5958189796 |
| Sum | 2101032.51 |
| Variance | 58880.44418 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -25 | 261 | 2.6% |
| -37 | 11 | 0.1% |
| -31 | 11 | 0.1% |
| -35 | 11 | 0.1% |
| -15.11 | 10 | 0.1% |
| -12.33 | 10 | 0.1% |
| -47.67 | 9 | 0.1% |
| -21.11 | 9 | 0.1% |
| -33 | 9 | 0.1% |
| -10.33 | 9 | 0.1% |
| Other values (6711) | 9500 |
| Value | Count | Frequency (%) |
| -312.83 | 1 | |
| -312.61 | 1 | |
| -312.28 | 1 | |
| -307.27 | 1 | |
| -298.91 | 1 | |
| -291.16 | 1 | |
| -280.16 | 1 | |
| -278.91 | 1 | |
| -272.27 | 1 | |
| -270.38 | 1 |
| Value | Count | Frequency (%) |
| 1121.54 | 1 | |
| 1113.78 | 1 | |
| 1109.76 | 1 | |
| 1105.42 | 1 | |
| 1103.43 | 1 | |
| 1094.44 | 1 | |
| 1094.11 | 1 | |
| 1092.88 | 1 | |
| 1090.99 | 1 | |
| 1087.53 | 1 |
ClaimsRate
Real number (ℝ≥0)
| Distinct | 142 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6796172589 |
| Minimum | 0 |
| Maximum | 1.62 |
| Zeros | 51 |
| Zeros (%) | 0.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.16 |
| Q1 | 0.39 |
| median | 0.72 |
| Q3 | 0.98 |
| 95-th percentile | 1.09 |
| Maximum | 1.62 |
| Range | 1.62 |
| Interquartile range (IQR) | 0.59 |
Descriptive statistics
| Standard deviation | 0.3160823264 |
|---|---|
| Coefficient of variation (CV) | 0.4650887279 |
| Kurtosis | -1.193398532 |
| Mean | 0.6796172589 |
| Median Absolute Deviation (MAD) | 0.28 |
| Skewness | -0.2513727061 |
| Sum | 6694.23 |
| Variance | 0.09990803707 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 444 | 4.5% |
| 1.01 | 208 | 2.1% |
| 1.02 | 197 | 2.0% |
| 1.03 | 191 | 1.9% |
| 0.99 | 189 | 1.9% |
| 0.98 | 170 | 1.7% |
| 0.97 | 145 | 1.5% |
| 0.95 | 140 | 1.4% |
| 1.04 | 138 | 1.4% |
| 0.91 | 134 | 1.4% |
| Other values (132) | 7894 |
| Value | Count | Frequency (%) |
| 0 | 51 | |
| 0.01 | 1 | < 0.1% |
| 0.03 | 2 | < 0.1% |
| 0.04 | 5 | 0.1% |
| 0.05 | 4 | < 0.1% |
| 0.06 | 19 | 0.2% |
| 0.07 | 12 | 0.1% |
| 0.08 | 35 | |
| 0.09 | 35 | |
| 0.1 | 25 |
| Value | Count | Frequency (%) |
| 1.62 | 1 | < 0.1% |
| 1.54 | 1 | < 0.1% |
| 1.51 | 1 | < 0.1% |
| 1.49 | 1 | < 0.1% |
| 1.41 | 1 | < 0.1% |
| 1.39 | 1 | < 0.1% |
| 1.37 | 1 | < 0.1% |
| 1.36 | 1 | < 0.1% |
| 1.34 | 4 | |
| 1.33 | 1 | < 0.1% |
PremMotor
Real number (ℝ≥0)
| Distinct | 1900 |
|---|---|
| Distinct (%) | 19.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 306.3858142 |
| Minimum | 1.78 |
| Maximum | 585.22 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | 1.78 |
|---|---|
| 5-th percentile | 84.2895 |
| Q1 | 206.15 |
| median | 308.28 |
| Q3 | 411.41 |
| 95-th percentile | 515.54 |
| Maximum | 585.22 |
| Range | 583.44 |
| Interquartile range (IQR) | 205.26 |
Descriptive statistics
| Standard deviation | 132.2857049 |
|---|---|
| Coefficient of variation (CV) | 0.4317618467 |
| Kurtosis | -0.8754528165 |
| Mean | 306.3858142 |
| Median Absolute Deviation (MAD) | 102.24 |
| Skewness | -0.07873390618 |
| Sum | 3017900.27 |
| Variance | 17499.50773 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 298.61 | 42 | 0.4% |
| 398.74 | 17 | 0.2% |
| 361.29 | 16 | 0.2% |
| 246.49 | 16 | 0.2% |
| 206.15 | 15 | 0.2% |
| 279.61 | 15 | 0.2% |
| 409.52 | 14 | 0.1% |
| 346.51 | 13 | 0.1% |
| 381.96 | 13 | 0.1% |
| 269.94 | 13 | 0.1% |
| Other values (1890) | 9676 |
| Value | Count | Frequency (%) |
| 1.78 | 1 | |
| 3.78 | 1 | |
| 4.78 | 1 | |
| 7.67 | 1 | |
| 8.67 | 2 | |
| 9.67 | 1 | |
| 11.78 | 1 | |
| 13.56 | 1 | |
| 13.67 | 1 | |
| 14.56 | 1 |
| Value | Count | Frequency (%) |
| 585.22 | 1 | < 0.1% |
| 581.33 | 1 | < 0.1% |
| 580.11 | 1 | < 0.1% |
| 578.33 | 1 | < 0.1% |
| 577.33 | 1 | < 0.1% |
| 576.33 | 1 | < 0.1% |
| 575.44 | 2 | |
| 574.44 | 1 | < 0.1% |
| 574.33 | 1 | < 0.1% |
| 569.55 | 3 |
PremHousehold
Real number (ℝ)
| Distinct | 973 |
|---|---|
| Distinct (%) | 9.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 192.4878985 |
| Minimum | -75 |
| Maximum | 1231.9 |
| Zeros | 59 |
| Zeros (%) | 0.6% |
| Negative | 1080 |
| Negative (%) | 11.0% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | -75 |
|---|---|
| 5-th percentile | -30.55 |
| Q1 | 48.35 |
| median | 127.8 |
| Q3 | 274.5 |
| 95-th percentile | 642.35 |
| Maximum | 1231.9 |
| Range | 1306.9 |
| Interquartile range (IQR) | 226.15 |
Descriptive statistics
| Standard deviation | 211.8883235 |
|---|---|
| Coefficient of variation (CV) | 1.100787765 |
| Kurtosis | 2.854263525 |
| Mean | 192.4878985 |
| Median Absolute Deviation (MAD) | 98.9 |
| Skewness | 1.614183135 |
| Sum | 1896005.8 |
| Variance | 44896.66163 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 39.45 | 61 | 0.6% |
| 19.45 | 60 | 0.6% |
| -45.55 | 60 | 0.6% |
| 0 | 59 | 0.6% |
| -30.55 | 57 | 0.6% |
| -5.55 | 56 | 0.6% |
| 69.45 | 54 | 0.5% |
| 44.45 | 53 | 0.5% |
| -40.55 | 53 | 0.5% |
| 34.45 | 53 | 0.5% |
| Other values (963) | 9284 |
| Value | Count | Frequency (%) |
| -75 | 18 | 0.2% |
| -70 | 33 | |
| -65 | 35 | |
| -60 | 31 | |
| -55 | 27 | |
| -50 | 44 | |
| -45.55 | 60 | |
| -45 | 36 | |
| -40.55 | 53 | |
| -40 | 33 |
| Value | Count | Frequency (%) |
| 1231.9 | 1 | |
| 1228 | 1 | |
| 1216.9 | 1 | |
| 1202.45 | 2 | |
| 1201.9 | 1 | |
| 1198 | 1 | |
| 1194.1 | 2 | |
| 1179.1 | 1 | |
| 1154.1 | 2 | |
| 1153.55 | 1 |
PremHealth
Real number (ℝ)
| Distinct | 1001 |
|---|---|
| Distinct (%) | 10.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 169.6226091 |
| Minimum | -2.11 |
| Maximum | 442.86 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1 |
| Negative (%) | < 0.1% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | -2.11 |
|---|---|
| 5-th percentile | 54.9 |
| Q1 | 113.02 |
| median | 164.03 |
| Q3 | 220.82 |
| 95-th percentile | 298.39 |
| Maximum | 442.86 |
| Range | 444.97 |
| Interquartile range (IQR) | 107.8 |
Descriptive statistics
| Standard deviation | 74.26283219 |
|---|---|
| Coefficient of variation (CV) | 0.4378121087 |
| Kurtosis | -0.4292132297 |
| Mean | 169.6226091 |
| Median Absolute Deviation (MAD) | 53.9 |
| Skewness | 0.2903046845 |
| Sum | 1670782.7 |
| Variance | 5514.968244 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 162.81 | 57 | 0.6% |
| 130.47 | 30 | 0.3% |
| 159.14 | 27 | 0.3% |
| 178.7 | 27 | 0.3% |
| 158.14 | 26 | 0.3% |
| 169.7 | 26 | 0.3% |
| 219.93 | 25 | 0.3% |
| 136.58 | 25 | 0.3% |
| 146.36 | 24 | 0.2% |
| 157.03 | 24 | 0.2% |
| Other values (991) | 9559 |
| Value | Count | Frequency (%) |
| -2.11 | 1 | < 0.1% |
| 5.78 | 1 | < 0.1% |
| 7.78 | 1 | < 0.1% |
| 11.67 | 1 | < 0.1% |
| 12.67 | 1 | < 0.1% |
| 14.67 | 2 | |
| 15.56 | 1 | < 0.1% |
| 15.67 | 2 | |
| 16.56 | 4 | |
| 16.67 | 2 |
| Value | Count | Frequency (%) |
| 442.86 | 1 | |
| 440.86 | 1 | |
| 432.97 | 1 | |
| 417.3 | 1 | |
| 417.08 | 1 | |
| 408.41 | 1 | |
| 401.63 | 1 | |
| 398.41 | 1 | |
| 394.52 | 1 | |
| 393.74 | 1 |
PremLife
Real number (ℝ)
| Distinct | 461 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.1919665 |
| Minimum | -7 |
| Maximum | 183.48 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 660 |
| Negative (%) | 6.7% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | -7 |
|---|---|
| 5-th percentile | -1.11 |
| Q1 | 9.89 |
| median | 24.67 |
| Q3 | 53.01 |
| 95-th percentile | 121.8 |
| Maximum | 183.48 |
| Range | 190.48 |
| Interquartile range (IQR) | 43.12 |
Descriptive statistics
| Standard deviation | 38.13965229 |
|---|---|
| Coefficient of variation (CV) | 1.025480927 |
| Kurtosis | 1.883385686 |
| Mean | 37.1919665 |
| Median Absolute Deviation (MAD) | 17.89 |
| Skewness | 1.474881573 |
| Sum | 366340.87 |
| Variance | 1454.633077 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 25.56 | 160 | 1.6% |
| 9.89 | 128 | 1.3% |
| 3.89 | 119 | 1.2% |
| 0.89 | 119 | 1.2% |
| -1.11 | 116 | 1.2% |
| 5.89 | 106 | 1.1% |
| 6.89 | 106 | 1.1% |
| 12.89 | 105 | 1.1% |
| 4.89 | 104 | 1.1% |
| 7.89 | 101 | 1.0% |
| Other values (451) | 8686 |
| Value | Count | Frequency (%) |
| -7 | 65 | |
| -6 | 61 | |
| -5 | 74 | |
| -4 | 56 | |
| -3 | 70 | |
| -2 | 72 | |
| -1.11 | 116 | |
| -1 | 54 | |
| -0.11 | 92 | |
| 0.89 | 119 |
| Value | Count | Frequency (%) |
| 183.48 | 5 | |
| 182.7 | 2 | < 0.1% |
| 182.59 | 3 | |
| 182.48 | 4 | |
| 181.7 | 1 | < 0.1% |
| 181.59 | 3 | |
| 181.48 | 1 | < 0.1% |
| 180.59 | 3 | |
| 179.7 | 1 | < 0.1% |
| 179.59 | 2 | < 0.1% |
PremWork
Real number (ℝ)
| Distinct | 756 |
|---|---|
| Distinct (%) | 7.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.95146497 |
| Minimum | -12 |
| Maximum | 194.59 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 909 |
| Negative (%) | 9.2% |
| Memory size | 77.1 KiB |
Quantile statistics
| Minimum | -12 |
|---|---|
| 5-th percentile | -4 |
| Q1 | 10 |
| median | 25.34 |
| Q3 | 52.65 |
| 95-th percentile | 119.91 |
| Maximum | 194.59 |
| Range | 206.59 |
| Interquartile range (IQR) | 42.65 |
Descriptive statistics
| Standard deviation | 38.56906683 |
|---|---|
| Coefficient of variation (CV) | 1.043776393 |
| Kurtosis | 2.194138574 |
| Mean | 36.95146497 |
| Median Absolute Deviation (MAD) | 18.56 |
| Skewness | 1.508209591 |
| Sum | 363971.93 |
| Variance | 1487.572916 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 25.67 | 122 | 1.2% |
| 10.89 | 74 | 0.8% |
| 9.89 | 71 | 0.7% |
| 11.89 | 68 | 0.7% |
| -5.11 | 66 | 0.7% |
| -0.11 | 65 | 0.7% |
| 14.89 | 64 | 0.6% |
| 15.89 | 63 | 0.6% |
| 3.78 | 63 | 0.6% |
| 16.89 | 63 | 0.6% |
| Other values (746) | 9131 |
| Value | Count | Frequency (%) |
| -12 | 34 | |
| -11 | 31 | |
| -10 | 30 | |
| -9 | 28 | |
| -8 | 50 | |
| -7 | 39 | |
| -6.11 | 57 | |
| -6 | 44 | |
| -5.11 | 66 | |
| -5 | 48 |
| Value | Count | Frequency (%) |
| 194.59 | 1 | < 0.1% |
| 194.37 | 1 | < 0.1% |
| 194.26 | 1 | < 0.1% |
| 194.15 | 1 | < 0.1% |
| 192.37 | 3 | |
| 191.48 | 2 | |
| 191.37 | 3 | |
| 191.26 | 1 | < 0.1% |
| 189.59 | 1 | < 0.1% |
| 189.37 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
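The definition above translates directly into code. The following is a minimal standard-library sketch, not the routine the profiling tool itself uses; the function name and sample data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their std devs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and standard deviations (the n - 1 factors cancel in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)

# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
```

Here `r` and `r_rescaled` agree up to floating-point error, which illustrates the invariance under changes in location and scale.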
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
First rows
| | CustID | FirstPolYear | BirthYear | EducDeg | MonthSal | GeoLivArea | Children | CustMonVal | ClaimsRate | PremMotor | PremHousehold | PremHealth | PremLife | PremWork |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1985 | 1982 | b'2 - High School' | 2177.0 | 1 | 1 | 380.97 | 0.39 | 375.85 | 79.45 | 146.36 | 47.01 | 16.89 |
| 1 | 3 | 1991 | 1970 | b'1 - Basic' | 2277.0 | 3 | 0 | 504.67 | 0.28 | 206.15 | 224.50 | 124.58 | 86.35 | 99.02 |
| 2 | 4 | 1990 | 1981 | b'3 - BSc/MSc' | 1099.0 | 4 | 1 | -16.99 | 0.99 | 182.48 | 43.35 | 311.17 | 35.34 | 28.34 |
| 3 | 5 | 1986 | 1973 | b'3 - BSc/MSc' | 1763.0 | 4 | 1 | 35.23 | 0.90 | 338.62 | 47.80 | 182.59 | 18.78 | 41.45 |
| 4 | 6 | 1986 | 1956 | b'2 - High School' | 2566.0 | 4 | 1 | -24.33 | 1.00 | 440.75 | 18.90 | 114.80 | 7.00 | 7.67 |
| 5 | 7 | 1979 | 1943 | b'2 - High School' | 4103.0 | 4 | 0 | -66.01 | 1.05 | 156.92 | 295.60 | 317.95 | 14.67 | 26.34 |
| 6 | 8 | 1988 | 1974 | b'2 - High School' | 1743.0 | 4 | 1 | -144.91 | 1.13 | 248.27 | 397.30 | 144.36 | 66.68 | 53.23 |
| 7 | 9 | 1981 | 1978 | b'3 - BSc/MSc' | 1862.0 | 1 | 1 | 356.53 | 0.36 | 344.51 | 18.35 | 210.04 | 8.78 | 9.89 |
| 8 | 10 | 1976 | 1948 | b'3 - BSc/MSc' | 3842.0 | 1 | 0 | -119.35 | 1.12 | 209.26 | 182.25 | 271.94 | 39.23 | 55.12 |
| 9 | 11 | 1990 | 1945 | b'3 - BSc/MSc' | 3995.0 | 4 | 0 | 290.17 | 0.53 | 296.50 | 116.70 | 227.71 | 18.67 | 10.89 |
Last rows
| | CustID | FirstPolYear | BirthYear | EducDeg | MonthSal | GeoLivArea | Children | CustMonVal | ClaimsRate | PremMotor | PremHousehold | PremHealth | PremLife | PremWork |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9840 | 10285 | 1980 | 1987 | b'3 - BSc/MSc' | 1504.0 | 4 | 1 | -1.55 | 0.96 | 390.63 | 29.45 | 179.70 | -6.00 | 25.67 |
| 9841 | 10286 | 1985 | 1948 | b'3 - BSc/MSc' | 3878.0 | 4 | 1 | -57.45 | 1.04 | 269.05 | 217.25 | 219.93 | 32.45 | 25.67 |
| 9842 | 10287 | 1997 | 1943 | b'3 - BSc/MSc' | 3975.0 | 2 | 0 | 220.27 | 0.62 | 285.61 | 77.25 | 241.49 | 31.45 | 8.89 |
| 9843 | 10288 | 1996 | 1941 | b'2 - High School' | 3845.0 | 4 | 0 | 99.47 | 0.90 | 87.35 | 843.50 | 121.58 | 157.92 | 33.45 |
| 9844 | 10289 | 1982 | 1993 | b'2 - High School' | 1465.0 | 1 | 1 | 795.15 | 0.35 | 67.79 | 820.15 | 102.13 | 182.48 | 86.46 |
| 9845 | 10290 | 1986 | 1943 | b'2 - High School' | 3498.0 | 4 | 0 | 245.60 | 0.67 | 227.82 | 270.60 | 160.92 | 100.13 | 69.90 |
| 9846 | 10292 | 1984 | 1949 | b'4 - PhD' | 3188.0 | 2 | 0 | -0.11 | 0.96 | 393.74 | 49.45 | 173.81 | 9.78 | 14.78 |
| 9847 | 10294 | 1994 | 1976 | b'3 - BSc/MSc' | 2918.0 | 1 | 1 | 524.10 | 0.21 | 403.63 | 132.80 | 142.25 | 12.67 | 4.89 |
| 9848 | 10295 | 1981 | 1977 | b'1 - Basic' | 1971.0 | 2 | 1 | 250.05 | 0.65 | 188.59 | 211.15 | 198.37 | 63.90 | 112.91 |
| 9849 | 10296 | 1990 | 1981 | b'4 - PhD' | 2815.0 | 1 | 1 | 463.75 | 0.27 | 414.08 | 94.45 | 141.25 | 6.89 | 12.89 |